Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
This work introduces Private Eyes, the first zero-leakage biometric database. The only leakage of the system is unavoidable: 1) the log of the dataset size and 2) the fact that a query occurred. Private Eyes is built from oblivious symmetric searchable encryption. Approximate proximity queries are used: given a noisy reading of a biometric, the goal is to retrieve all stored records that are close enough according to a distance metric. Private Eyes combines locality sensitive-hashing or LSHs (Indyk and Motwani, STOC 1998) and oblivious maps which map keywords to values. One computes many LSHs of each record in the database and uses these hashes as keywords in the oblivious map with the matching biometric readings concatenated as the value. At search time with a noisy reading, one computes the LSHs and retrieves the disjunction of the resulting values from the map. The underlying oblivious map needs to answer disjunction queries efficiently. We focus on the iris biometric which requires a large number of LSHs, approximately 1000. Boldyreva and Tang’s (PoPETS 2021) design yields a suitable map for a small number of LSHs (their application was in zeroleakage k-nearest-neighbor search). Our solution is a zero-leakage disjunctive map designed for the setting when most clauses do not match any records. For the iris, on average at most 6% of LSHs match any stored value. We evaluate using the ND-0405 dataset; this dataset has 356 irises suitable for testing. To scale our evaluation, we use a generative adversarial network to produce synthetic irises. Accurate statistics on sizes beyond available datasets is crucial to optimizing the cryptographic primitives. This tool may be of independent interest. For the largest tested parameters of a 5000 synthetic iris database, a search requires 18 rounds of communication and 25ms of parallel computation. Our scheme is implemented and open-sourced.more » « lessFree, publicly-accessible full text available June 4, 2026
-
Biometric databases collect people's information and perform proximity search (finding records within bounded distance of the query) with few cryptographic protections. This work studies proximity searchable encryption applied to the iris biometric. Prior work proposed to build proximity search from inner product functional encryption (Kim et al., SCN 2018). This work identifies and closes two gaps in this approach: 1. Biometrics use long vectors, often with thousands of bits. Many inner product encryption schemes have to invert a matrix whose dimension scales with this size. Setup is then not feasible on commodity hardware. We introduce a technique that improves setup efficiency without harming accuracy. 2.Prior approaches leak distance between queries and all stored records. We propose a construction from function hiding, predicate, inner product encryption (Shen et al., TCC 2009) that avoids this leakage. Finally, we show that our scheme can be instantiated using symmetric pairing groups, which improves search efficiency.more » « less
-
Biometric databases collect people's information and allow users to perform proximity searches (finding all records within a bounded distance of the query point) with few cryptographic protections. This work studies proximity searchable encryption applied to the iris biometric. Prior work proposed inner product functional encryption as a technique to build proximity biometric databases (Kim et al., SCN 2018). This is because binary Hamming distance is computable using an inner product. This work identifies and closes two gaps to using inner product encryption for biometric search: Biometrics naturally use long vectors often with thousands of bits. Many inner product encryption schemes generate a random matrix whose dimension scales with vector size and have to invert this matrix. As a result, setup is not feasible on commodity hardware unless we reduce the dimension of the vectors. We explore state of the art techniques to reduce the dimension of the iris biometric and show that all known techniques harm the accuracy of the resulting system. That is, for small vector sizes multiple unrelated biometrics are returned in the search. For length 64 vectors, at a 90% probability of the searched biometric being returned, 10% of stored records are erroneously returned on average. Rather than changing the feature extractor, we introduce a new cryptographic technique that allows one to generate several smaller matrices. For vectors of length 1024 this reduces time to run setup from 23 days to 4 minutes. At this vector length, for the same $90%$ probability of the searched biometric being returned, .02% of stored records are erroneously returned on average. Prior inner product approaches leak distance between the query and all stored records. We refer to these as distance-revealing. We show a natural construction from function hiding, secret-key, predicate, inner product encryption (Shen, Shi, and Waters, TCC 2009). Our construction only leaks access patterns, and which returned records are the same distance from the query. We refer to this scheme as distance-hiding. We implement and benchmark one distance-revealing and one distance-hiding scheme. The distance-revealing scheme can search a small (hundreds) database in 4 minutes while the distance-hiding scheme is not yet practical, requiring 3.5 hours.more » « less
-
null (Ed.)Fuzzy extractors derive stable keys from noisy sources. They are a fundamental tool for key derivation from biometric sources. This work introduces a new construction, code offset in the exponent. This construction is the first reusable fuzzy extractor that simultaneously supports structured, low entropy distributions with correlated symbols and confidence information. These properties are specifically motivated by the most pertinent applications – key derivation from biometrics and physical unclonable functions – which typically demonstrate low entropy with additional statistical correlations and benefit from extractors that can leverage confidence information for efficiency. Code offset in the exponent is a group encoding of the code offset construction (Juels and Wattenberg, CCS 1999). A random codeword of a linear error-correcting code is used as a one-time pad for a sampled value from the noisy source. Rather than encoding this directly, code offset in the exponent encodes by exponentiation of a generator in a cryptographically strong group. We introduce and characterize a condition on noisy sources that directly translates to security of our construction in the generic group model. Our condition requires the inner product between the source distribution and all vectors in the null space of the code to be unpredictable.more » « less
An official website of the United States government
